Automated MUltiscale simulation environment

Multiscale techniques integrating detailed atomistic information on materials and reactions to predict the performance of heterogeneous catalytic full-scale reactors have been suggested but lack seamless implementation. The largest challenges in the multiscale modeling of reactors can be grouped into two main categories: catalytic complexity and the difference between time and length scales of chemical and transport phenomena. Here we introduce the Automated MUltiscale Simulation Environment AMUSE, a workflow that starts from Density Functional Theory (DFT) data, automates the analysis of the reaction networks through graph theory, prepares it for microkinetic modeling, and subsequently integrates the results into a standard open-source Computational Fluid Dynamics (CFD) code. We demonstrate the capabilities of AMUSE by applying it to the unimolecular iso-propanol dehydrogenation reaction and then, increasing the complexity, to the pre-commercial Pd/In2O3 catalyst employed for the CO2 hydrogenation to methanol. The results show that AMUSE allows the computational investigation of heterogeneous catalytic reactions in a comprehensive way, providing essential information for catalyst design from the atomistic to the reactor scale level.

mechanism and reaction energy profiles constructed by AutoProfLib are subsequently used by pyMKM Python library to automatically set the microkinetic analysis.

Note S1.1: PreProcessor class structure
The PreProcessor class is designed to pre-treat computational data, and its workflow is depicted in Figure S1.It has the following capabilities: • Converts the CONTCAR file to an .xyzfile that contains the Cartesian coordinates of the molecule (coordinates).
• Assembles the molecular structure (if requested) by processing the CONTCAR file (apply_pbc).
• Computes the Gibbs or Helmholtz free energies of the parsed systems (Gibbs, Helmholtz).
The PreProcessor class processes the structure (either a CONTCAR or a .xyzfile), energies, and frequencies (from OUTCAR) in the user-specified directory.The user can apply the Periodic Boundary Condition for adsorbed intermediates to assemble them if required, and the coordinates of the reconstructed intermediate will be exported to the _pbc.xyzfile.If the Gibbs method is used, the molecule's symmetry can be determined based on the .xyzfile, and the frequencies are processed according to the procedure presented in Note S1.2.The processing of the frequencies is required to tackle the low-frequency values, which contributions artificially increase the vibrational entropy.Subsequently, regardless of the method used (Gibbs or Helmholtz), the frequencies are processed using the Erase, Substitute, or Grimme methods, based on the user's choice.The frequency list is then exported to the freq.txtfile, and the Gibbs or Helmholtz free energies are computed.Finally, the energy_summary.txtfile, containing the different contributions to the free energies and their corresponding values, is exported as the final summary.

PreProcessor class utilization
We provide the UsePreProcessor.py to use the PreProcessor class in the Supporting Link.The AutoProfLib.pyfile must be present in the same directory.The UsePreProcessor.py script can be used to calculate the Gibbs and Helmholtz free energies separately by typing "python <PATH-TO-UsePreProcessor.py>" in the conda prompt or any other Python interpreter.The potential outputs include a .xyzfile with the atomic coordinates, the frequency list (freq.txt), the _pbc.xyzfile, and the energy_summary.txtfile, containing the Gibbs or Helmholtz free energy values for the molecule or adsorbed intermediate specified by the user.

Gibbs and Helmholtz free energy calculation
The UsePreProcessor.py can calculate either Gibbs or Helmholtz free energies.Here its usage will be described in the example of calculating Gibbs free energies, with the explanation in the last paragraph about the differences for estimating Helmholtz free energies.
To estimate the Gibbs free energies the optimization CONTCAR and the OUTCAR files are required.They should be placed either inside the directory with the geometry optimization results or on the same level.The output will be an .xyzfile with atomic coordinates, a frequency list (freq.txt),and an energy_summary.txtfile containing the calculated values of Gibbs free energy for the specified molecule.To execute the script, follow these steps: • Place UsePreProcessor.py and AutoProfLib.py in the directory with the geometry optimization results.
• In the Anaconda prompt (or any other Python interpreter), go to the desired location and type python UsePreProcessor.py to start the script execution (the UsePreProcessor.pyfile must be in the same folder as AutoProfLib.py).
• The program will ask for the type of free energy to be calculated, which in this case is Gibbs.
• Next, the program will ask for the name of the directory where the geometry optimization results are located.This directory requires CONTCAR and OUTCAR files, and the program assumes that a directory named FREQ or ../Freq, containing the OUTCAR from the frequency calculations, is located in the same directory or one level up.
• Afterwards, the program will ask for the initial conditions in which the Gibbs free energy will be calculated: -Temperature (T).
-Spin of the molecule.
-The program will also ask how to treat frequencies of low values, with three options to choose from: Erase, Substitute, or Grimme.Under the Grimme method, the vibrational and rotational contributions are weighted according to the method proposed by Grimme et al. 1 in Reference 1, and the contributions for the lowest frequencies are penalized (tending towards 0).Additionally, the user can use the "average inertia momentum" approximation by setting max_freq [2] = True.
Otherwise, the Erase and Substitute options remove or substitute the frequencies below the threshold defined by the user (in cm −1 , the recommended threshold is 100 cm −1 ).
To calculate Helmholtz free energies, the launching procedure of the UsePreProcessor.pyscript is the same as for the Gibbs free energies.During its execution the user will be asked about the desired temperature and the way of processing the frequencies.The program will also ask if to apply Periodic Boundary Conditions to assemble the structure.If the answer will be yes, then, the program will estimate the maximum threshold (in partial coordinates) to apply the PBC (max_pbc in Inputs) based on the provided surface (typically, over 2/3rds of the surface).If the user agrees with the estimation, the PBC will be applied.Finally, the _pbc.xyzfile will be generated in the work folder, the new coordinates will be printed as a list of dictionaries containing the atom label and the 3 spatial coordinates.After that, the Helmholtz free energy will be calculated.The frequency list and energy summary will also be saved in the path specified by the user.An example of calculating the Helmholtz free energy can be found in the Supporting Link.

Note S1.2: Estimation of Gibbs and Helmholtz free energies
In this section, we will present the methodology and equations used in this work to calculate Helmholtz and Gibbs free energies, which were extracted from the documentation of the Atomic Simulation Environment (ASE) Python library. 2 Gibbs free energy: The ideal gas approximation is commonly used to compute the Gibbs free energy (Equation S1).According to this approximation, particles are treated as punctual and do not interact with each other.First, the enthalpy H(T ) is estimated as a function of the integral of the heat capacity at constant pressure (Equation S2).The heat capacity C v is the derivative of the energy as a function of temperature, and is divided into translational, rotational, and vibrational contributions, as shown in Expression S3.The Zero Point Energy correction is given by E ZP E = i hν i 2 , where ν i represents the frequencies obtained using DFT.The number of frequencies used depends on the molecule's geometry: 3N -6 for nonlinear molecules, 3N -5 for linear molecules, and 3 for monoatomic gases, where N represents the number of atoms in the molecule. 3T, P ) = H(T ) − S(T, P ) • T (S1) PBR Next, the entropy can be estimated by applying a similar decomposition as that used for C v , as shown in Expression S4.Here, P 0 represents the standard pressure (usually 1.013•10 5 Pa), M refers to the mass of the gas in S trans , I denotes the inertia momentum in S rot , and s represents the spin of the molecule in S electronic .
Helmholtz free energy: The Helmholtz free energy (Equation S5) is computed considering only the vibrational contributions to heat capacity and entropy, and employs the 3N frequencies ν.First, the internal energy U is estimated using Equation S6.Then, the entropy is calculated using S vibrational from Expression S4.Once the adjacency matrix for all the molecules parsed by AutoProfLib has been defined, the connectivity dictionaries are generated.As illustrated in Figure S3, the connectivity dictionaries contain the following keys: • Empty: this key determines if the intermediate is the empty surface, the empty surface with a vacancy, or neither.The value of the key is 1 for the first two cases and "" for the last.
• One label for each element in the system.In the example depicted in S3 d), these keys correspond to C, O, and H.The corresponding value is a list comprising the total number of bonds of all the atoms labeled as the key in the intermediate, the number of atoms labeled as the key but not bonded to the principal chain, and the total number of atoms labeled as the key present in the intermediate.
Thus, the example connectivity dictionary can be interpreted as follows: the species consists of 3 carbon atoms bonded together, one of which is attached to 2 other carbon atoms, 1 H, and 1 O.The remaining two carbon atoms are bonded to 1 carbon and 3 H each.Moreover, there are no free oxygen or hydrogen atoms in the system, and the oxygen atom is bonded to 1 H.
Subsequently, the connectivity dictionaries are employed to establish connections between different intermediates and build the mechanism as a graph using the reactive test functions.
To match the intermediates (e.g.n i and n j ) a node comparison (i.e.reactive test) function is used.Within this function, the reactive processes should be specified, which involves adding atoms (e.g.hydrogenation, halogenation, oxidation, gas species adsorptions) and atoms separation (e.g. via bond breaking and desorptions).If any transformation on n i results in n j , such that n i,transf ormed = n j , then f(n i , n j ) = 1 and n i and n j are connected.
Otherwise, f(n i , n j ) = 0 and n i and n j are not connected.The outcome of applying the function f(n i , n j ) to all species in the reaction network is the system adjacency matrix with elements a ij equal to 1 if intermediate i is connected to j and 0 otherwise.
To exemplify how the reactive test functions operate, we have used the oxygen_hydrogenation_test on the state containing adsorbed CH 3 CHOCH 3 +H molecules (labeled as i3) and the state containing adsorbed iso-propanol (labeled as i2), as shown in Figure S3 a) through c).The oxygen_hydrogenation_test generates a copy of i3 and adds a hydrogen bond to the oxygen of that copy by modifying the O and H entries (Figure S3 d) to e)).Subsequently, the modified version of i3's connectivity dictionary is compared to all other intermediates.If the modified dictionary is identical to another intermediate, those two states are connected.In the example in Figure S3, i3 and i2 states are connected.
Then, by applying the test functions iteratively to all the intermediates, the mechanism graph is generated.Finally, utilizing this graph, the connections between intermediates are related to the energies for each one of the intermediates and transition states, applying the reference stated by the user, and generating the reaction energy profile.

Note S2: Details on PyMKM
The inputs for the PyMKM are the reaction mechanism, the energy profile, and the operating conditions (T, P, gas phase and surface compositions) generated with the AutoProfLib, The mechanism input is parsed using the rm_parser.pyfile, provided in the Supporting Link.This file, named rm.mkm,includes the following: 1.The first few lines represent global reactions in the gas phase, with each global reaction having the syntax: Label of the global reaction (to be set by the user) index of the main product desorption reaction: global reaction in gas phase (automatically generated).
2. Three blank lines are used to separate the global reactions from the elementary steps.
3. A list of elementary steps follows.Each elementary step has the following syntax: reactant 1 + reactant 2 → product 1 + product 2.The three last characters of the gas phase molecules should be "(g)".
The energy input file, g.mkm, is parsed using the g_parser.py.The syntax of g.mkm is as follows: 1.The first part of the file consists of a list of TS energies.The number of entries should be equal to the number of elementary reactions present in the mechanism file.Each entry in the TS energies list has the following syntax: internal energy (or enthalpy), blank space, entropy in electron-Volt per K at certain temperature (eV•K −1 ).The syntax is general for all the input energy files.
2. Three blank lines separate the TS energies list from the next section.
3. The next section is a list of energies of intermediates.
4. Three blank lines separate the energies of the intermediates list from the next section.
5. The last section is a list of energies of gas species.In the example shown in the Supporting Link, the gas phase values are 0 because the provided energies are referenced.
Given these inputs, the stoichiometric coefficients, barriers, and reaction energy increments for each elementary reaction in the mechanism are automatically calculated by pyMKM.These parameters are used to calculate the kinetic constants (k) via the Eyring equation (Equation S7) for surface reactions, where k B is the Boltzmann constant, h is the Planck constant, E a is the activation energy for the elementary step, and T is the temperature in Kelvin.For adsorption/desorption processes, the Knudsen-Hertz equation is used instead (Equation S8), where p is the partial pressure of the adsorbed gas species, A is the adsorption site area, and M is the mass of the gas species.The user can select the type of reactor to be simulated, either a differential or a dynamic Continuous Stirred Tank Reactor (CSTR), both implemented in the reactor.pyfile.For the CSTR reactor, the user can set the radius (radius) and the length (length) of the reactor, the inlet volumetric flow rate (Q), the catalyst mass (m cat ), the BET surface (S BET ), and the area of the active site (A site ).
After calculating the direct kinetic constants, the reverse kinetic constants can be estimated as illustrated in Figure 3 b) in the main text, which involves calculating the equilibrium constant K eq from the exponential term of the Arrhenius equation and the free energy difference between the initial and final states instead of the barrier.These tasks are performed with the utilities contained in the thermo.pylibrary, developed at ICIQ, which can also be used for thermodynamic consistency testing.Additionally, a reversibility test can be set by the user. 4he Ordinary Differential Equation (ODE) system is automatically generated with this information, as explained in the main text, and then solved using the single_run method.
This method iteratively modifies the temperature and pressure values to calculate the apparent reaction orders and energy.
The Degree of Rate Control (DRC) 5 can also be calculated by the user using Equation S9, where r + is the direct rate and G T S j is the free energy of the studied transition state.The Degree of Selectivity Control (DSC) can also be calculated by performing slight perturbations in the intermediate and TS energy values using the single_run method.Additionally, it is also possible to estimate the apparent reaction orders of the reactants.
To ensure the accuracy and usability of our microkinetic framework, we reproduced the example reported in the work of Filiot. 6This example is the mechanism of the transformation of generic species A into B in a single unimolecular step, including the adsorption and desorption of reactant and product.To perform a 1-to-1 comparison with the benchmark, we adapted the pre-exponential factor of the kinetic constant used in the example, which was approximated to 10 13 instead of

Note S3: Details on the CatalyticFoam simulations
The reactive flow at the catalyst grain scale is described by the multispecies, compressible, Navier-Stokes equations.We used CatalyticFoam to perform computational fluid dynamics (CFD) simulations in this work, based on the general multi-physics solver OpenFOAM 7-9 and solving the conservation of total mass, mixture momentum, mass fraction for each species and mixture energy.Technically, since AMUSE is fully modular, it would be possible to combine it with any other CFD after adapting the input/output functionalities of our workflow.
However, AMUSE is specifically designed to be combined with CatalyticFoam, which is an accurate, versatile, and robust tool.By means of CatalyticFoam, homogeneous gas-phase reactions can be included and a specific boundary treatment of heterogeneous catalytic walls for mass and heat balance was included to couple the microkinetic models.In addition, a splitting-operator approach is used to solve the species mass fraction and energy equations by successively solving and updating fields for reaction and convection-diffusion sub-problems within a time iteration.The catalytic surface coverage evolution of individual adsorbed species is also updated during the reaction operator.The CatalyticFOAM solver enables the treatment of reaction on the catalyst particle surface and the physical process related to reaction-diffusion within the catalyst grain is not considered in this paper.In the present study, a mesh convergence study on a single isolated sphere is performed in the flow regime of interest (Re = ρvdp µ ≈ 5), estimated from 10 to assess the grid resolution independence of the computed solution.Here, ρ stands for the density of the gaseous phase, v is the gas velocity, d p is the catalyst particle diameter and µ is the gas viscosity.For the case of a single sphere, 10 different meshes were generated corresponding to grid resolution dp ∆x ranging from 5 to 100.In addition, a boundary layer composed of 5 layers with an expansion ratio of 1.15 is added close to the sphere surface to ensure that the code captures the concentration gradients.The meshes are realized using the blockMesh and snappyHexMesh utilities from the OpenFOAM software.For each case, the overall conversion is followed until steady-state, and shown in Table S2.From Table S2, a grid resolution corresponding to dp ∆x = 60 is fine enough to fall below 5% error with the M9 reference case with reasonable computational time for Co and M7 is required for Pd and In 2 O 3 -based catalysts.
To ensure accuracy in our simulations, a mesh convergence study was also conducted over a reactor bed for both iso-propanol and the CO 2 hydrogenation systems.In order to avoid grid snapping problems of spheres with the snappyHexMesh for complex bed configurations and following the guidelines of the resolution found for a single isolated sphere the opensource Salome meshing tools were used to generate the meshes used in this work.It was found that a resolution of dp ∆x ≈ 50 is dense enough to properly capture the complexity of the system, obtaining the most robust estimations.Then, for the CO 2 system, the results of the mesh convergence test are shown in Table S1, where we observed that conversion was only achieved for the denser meshes ( dp ∆x ≈ 60).Considering the tests for both systems, the global results suggest that the branching of the chemical mechanism and the type of catalyst (either metal or oxide) also play an important role in CFD simulation computational performance (i.e.reducing mechanism complexity may drastically improve the CFD performances).To run the simulation, the user needs to find the NASA polynomials in CHEMKIN syntax for each species in the system, which can be obtained from databases such as the 3rd Millennium Database. 11The user also needs to find the transport parameters for each species, which can be found in the OpenFOAM examples database.Finally, the kinetic, thermochemical, and transport information is processed using the CatalyticFoam_CHEMKINPreProcessor script, which is included in CatalyticFoam.

Note S4: Key descriptors identification for CO 2 hydrogenation systems
We performed the Degree of Rate Control (DRC) analysis 5 to identify the most relevant elementary steps in a given mechanism.As an example, we report the DRC results for the CP a model in Table S7.However, it has to be kept in mind that the DRC may differ for each In 2 O 3 -based model.As described in the main text, R15 reaction, H 3 CO + H to H 3 COH, was identified as crucial for MeOH formation, and R12, H 2 COH to H 2 CO + OH, was chosen to describe the concurring RWGS reaction.We then decided to try to create a correlation between the calculated apparent activation energies for the various Pd/In 2 O 3 (including In 2 O 3 by itself) materials, and reaction energies of R12 and R15 steps.For that, we have used Random Forest model, implemented in the Analytic nodes of KNIME opensource software.The number of used features was limited to 2, the depth of a decision tree was also limited to 2, and 150 trees were created for a random forest run.To test the model during its creation, leave-one-out method was applied.The parity plot between the predicted and calculated in microkinetic analysis apparent activation energy can be found in Figure S14.The correlation is of good quality, as proven by the values of correlation coefficient of 0.98, and slope of 0.82.However, due to the very small size of the dataset, it was not possible to perform an external validation of the model.Yet, this approach demonstrates that using only two energy descriptors could be enough to properly describe the material performance and accelerate the whole analysis.S2).

Supporting Tables
S6) Note S1.3: the AutoProfLib class structureThe AutoProfLib generates the mechanism by producing the molecular adjacency matrix and connectivity dictionaries for all the intermediates in advance.The AutoProfLib parses the .xyzfiles provided by the user or created with the PreProccessor class, and stores the coordinates and labels of the atoms as a list of dictionaries.Subsequently, the adsorbed molecules are identified and separated from the surface.The connectivity between the molecules is then determined using the is_in_the_sphere function, which operates based on the scheme shown in FigureS2.The is_in_the_sphere function creates a sphere centered on each atom of the molecule, and the neighboring atoms that lie within the sphere's radius are deemed to be bonded to the central atom.For each iteration (referred to as i in FigureS2), the radius of the sphere increases until either: a) the maximum number of bonds of the central atom is achieved, or b) the radius is greater than or equal to the maximum interatomic distance of the central atom (as specified in the periodictable.csvfile).
k B T h (being T the temperature, and k B and h the Boltzmann and Planck constants respectively).As shown in Figure S12, our predicted surface population along time are identical to the ones presented in the example from 200 to 850 K. 6 Figure S6, showing that the workflow is fully automated until the kinetic input of CFD simulations.The AutoProfLib function uses the OpenFOAM_Mechanism function to convert the mechanism generated with AutoProfLib into elementary reactions, which are written in cat-alyticFoam syntax.Meanwhile, the energy profile is plotted and the intermediates, species in gas phase, and Transition State energies are exported in the g.mkm and the g_ref.mkmfiles.The rm.mkm and g.mkm (or g_ref.mkm)are the input files for pyMKM, which calculates the barriers (direct and forward) for each reaction in the rm.mkm file.Finally, the Auto_MKM_toOF.pyscript combines the AutoProfLib and pyMKM information, generating the OpenFOAM mechanism.

Figure S2 :
Figure S2: Scheme explaining how the is_in_the_sphere function works using the central carbon of 1-propanol as an example: a) iso-propanol graph, b) example of the is_in_the_sphere algorithm applied on the central atom of iso-propanol.

Figure S3 :
Figure S3: Example of using the connectivity dictionaries to generate the system adjacency matrix.

Figure S4 :
Figure S4: Scheme of the pyMKM internal workflow.MASI stands for Most Abundant Surface Intermediate.

Figure S5 :
Figure S5: Schematic representation of the CFD simulation meshes: a) General view of the simulated Fixed-Bed Reactor, b) Detail on the spheres of a), c) slice of b), d) Detail around one of the spheres shown in c), e) Slice of the general view of the single sphere model, f ) Zoom around e), showing the detail of the mesh, and g) Detail on f ).The single sphere model corresponds to M6 meshing, while the Fixed-Bed corresponds to M7 (see TableS2).

Figure S10 :
Figure S10: Reaction mechanism visualization for the CO 2 hydrogenation to methanol on Pd(111).

Figure S11 :
Figure S11: PyMKM species distribution benchmark for a generic species A being transformed to B, including the adsorption and desorption steps for both reactants and products at: a) 200 K, b) 350 K, and c) 800 K.The energies and the reference populations (in grey) were taken from reference 11, 6 and compared with our PyMKM predictions (black).

Figure S12 :
Figure S12: Reaction mechanism visualization for the CO 2 hydrogenation to methanol on the In 2 O 3 surface.

Figure S13 :
Figure S13: Apparent activation energy estimated with PyMKM for the CO 2 hydrogenation systems for the CO (RWGS) production for all the systems compared to previous computational results and experiments 10

Figure S14 :
Figure S14: Random Forest model for predicting the apparent activation energy of the methanol reaction.a) Parity plot between microkinetic model (MKM) output and random forest (RF) model for the investigated materials and b) related error distribution.

Table S1 :
Summary of the conversion (X) results of CFD simulations for the CO 2 hydrogenation on the Pd(111) and CP a cases and iPrOH on Co(0001) and Co(11 20).

Table S2 :
Summary of the conversion (X) results from CFD simulations for the iPrOH dehydrogenation on the Co(11 20) and CO 2 hydrogenation on the CPa model for a single sphere.

Table S3 :
10crokinetic estimation of the apparent activation Gibbs free energy (G a ) of methanol formation and the Reverse Water-Gas Shift (RWGS) as a function of the number of Pd atoms present in the Pd-doped In 2 O 3 (111) system.These estimated values were taken from Reference.10SystemExposedPd atoms G a,M eOH / kJ mol −1 G a,RW GS / kJ mol −1

Table S4 :
Mechanism of CO 2 hydrogenation to methanol on Pd(111)."*" denotes an adsorption site on the catalyst surface.

Table S6 :
Labels of reaction species (adsorbed and in gas phase) identified by AutoProfLib, together with their explanations.

Table S7 :
10rect and reverse barriers for the COOH and the CHOO intermediates formation, G a,COOH and G a,CHOO respectively.Barriers obtained from the data provided in Ref.10G a,COOH / eV G a,CHOO / eV Model direct reverse direct reverse CP a

Table S8 :
Degree of Rate Control (DRC) analysis for the CP a model in CO 2 hydrogenation to methanol and reverse water-gas shift (RWGS).The intermediates i are labeled corresponding to TableS6.